 martingale measure


On the grid-sampling limit SDE

arXiv.org Machine Learning

In our recent work [3], we introduced the grid-sampling SDE as a proxy for modeling exploration in continuous-time reinforcement learning. In this note, we provide further motivation for the use of this SDE and discuss its well-posedness in the presence of jumps.


A random measure approach to reinforcement learning in continuous time

arXiv.org Machine Learning

We present a random measure approach for modeling exploration, i.e., the execution of measure-valued controls, in continuous-time reinforcement learning (RL) with controlled diffusion and jumps. First, we consider the case in which the randomized control is sampled on a discrete-time grid and reformulate the resulting stochastic differential equation (SDE) as an equation driven by suitable random measures. The construction of these random measures makes use of the Brownian motion and the Poisson random measure (the sources of noise in the original model dynamics) as well as the additional random variables that are sampled on the grid for the control execution. Then, we prove a limit theorem for these random measures as the mesh size of the sampling grid goes to zero, which leads to the grid-sampling limit SDE, jointly driven by white noise random measures and a Poisson random measure. We also argue that the grid-sampling limit SDE can substitute for the exploratory SDE and the sample SDE of the recent continuous-time RL literature, i.e., it can be applied to the theoretical analysis of exploratory control problems and to the derivation of learning algorithms.
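To make the idea of sampling a randomized control on a discrete-time grid concrete, the following is a minimal sketch and not the construction from the paper. It assumes a one-dimensional diffusion without jumps, a hypothetical drift b(x, a) = a, a constant volatility of 0.5, and a hypothetical Gaussian policy N(-x, 1); the function name `simulate_grid_sampling` and all parameters are illustrative. The action is redrawn only at grid points and held constant in between, while the Brownian increments are simulated on a finer Euler-Maruyama grid.

```python
import numpy as np


def simulate_grid_sampling(x0, T, n_fine, sample_every, seed=0):
    """Euler-Maruyama path with actions resampled on a coarse grid.

    Assumed toy model (not from the source): dX = a dt + 0.5 dW,
    with a drawn from the policy N(-X, 1) at each sampling-grid point
    and held fixed until the next one. Jumps are omitted.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_fine
    x = np.empty(n_fine + 1)
    x[0] = x0
    actions = np.empty(n_fine)
    a = 0.0
    for k in range(n_fine):
        if k % sample_every == 0:
            # New sampling-grid point: draw a fresh action from the policy.
            a = rng.normal(loc=-x[k], scale=1.0)
        actions[k] = a
        dw = rng.normal(scale=np.sqrt(dt))  # Brownian increment over dt
        x[k + 1] = x[k] + a * dt + 0.5 * dw
    return x, actions


if __name__ == "__main__":
    path, acts = simulate_grid_sampling(x0=1.0, T=1.0, n_fine=100, sample_every=10)
    print(path[-1], acts[::10])
```

Decreasing `sample_every` relative to `n_fine` refines the sampling grid, which is the regime in which the abstract's limit theorem applies; this sketch only illustrates the pre-limit dynamics.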


Deep Hedging: Learning to Remove the Drift under Trading Frictions with Minimal Equivalent Near-Martingale Measures

arXiv.org Machine Learning

We present a machine learning approach for finding minimal equivalent martingale measures for market simulators of tradable instruments, e.g., for a spot price and options written on the same underlying. We extend our results to markets with frictions, in which case we find "near-martingale measures" under which the prices of hedging instruments are martingales within their bid/ask spread. By removing the drift, we are then able to use Deep Hedging to learn a "clean" hedge for an exotic payoff, one that is not polluted by the trading strategy trying to make money from statistical arbitrage opportunities. We correspondingly highlight the robustness of this hedge to estimation error in the original market simulator. We discuss applications to two market simulators.


Deep Hedging: Learning Risk-Neutral Implied Volatility Dynamics

arXiv.org Machine Learning

We present a numerically efficient approach for learning a risk-neutral measure for paths of simulated spot and option prices up to a finite horizon under convex transaction costs and convex trading constraints. This approach can then be used to implement a stochastic implied volatility model in the following two steps: 1. Train a market simulator for option prices, as discussed for example in our recent work Bai et al. (2019); 2. Find a risk-neutral density, specifically the minimal entropy martingale measure. The resulting model can be used for risk-neutral pricing, or for Deep Hedging (Buehler et al., 2019) in the case of transaction costs or trading constraints. To motivate the proposed approach, we also show that market dynamics are free from "statistical arbitrage" in the absence of transaction costs if and only if they follow a risk-neutral measure. We additionally provide a more general characterization in the presence of convex transaction costs and trading constraints. These results can be seen as an analogue of the fundamental theorem of asset pricing for statistical arbitrage under trading frictions and are of independent interest.
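As an illustration of what a minimal entropy martingale measure looks like in the simplest possible case, here is a hedged sketch for a one-period, single-asset, frictionless toy setting with equally weighted simulated returns. It is not the paper's method, which handles paths, transaction costs, and trading constraints. It relies on the standard fact that minimizing relative entropy subject to a zero-mean constraint yields an exponential tilt q_i ∝ exp(-λ r_i); the names `tilted_mean` and `min_entropy_martingale_weights` are hypothetical, and λ is found by bisection.

```python
import numpy as np


def tilted_mean(lam, r):
    """Return tilted weights q_i ∝ exp(-lam * r_i) and the mean of r under q."""
    z = -lam * r
    w = np.exp(z - z.max())  # subtract the max for numerical stability
    q = w / w.sum()
    return q, float(q @ r)


def min_entropy_martingale_weights(returns, iters=200):
    """Weights closest in relative entropy to uniform with zero mean return."""
    r = np.asarray(returns, dtype=float)
    if not (r.min() < 0.0 < r.max()):
        raise ValueError("returns must take both signs, otherwise no such measure exists")
    # The tilted mean is strictly decreasing in lam, so bracket the root.
    lo, hi = -1.0, 1.0
    while tilted_mean(lo, r)[1] <= 0.0:
        lo *= 2.0
    while tilted_mean(hi, r)[1] >= 0.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid, r)[1] > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    q, _ = tilted_mean(lam, r)
    return q, lam


if __name__ == "__main__":
    sample_returns = np.array([-0.02, 0.01, 0.03, 0.05])
    q, lam = min_entropy_martingale_weights(sample_returns)
    print(q, lam, q @ sample_returns)
```

Under the reweighted samples the expected return vanishes, so the drift that a strategy could otherwise exploit is removed; the multi-period case with frictions described in the abstract requires considerably more machinery than this sketch.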